Congested crowd instance localization with dilated convolutional swin transformer
Abstract
Crowd localization is a new computer vision task that evolved from crowd counting. Unlike the latter, it provides precise location information for each instance rather than only a count for the whole scene, which brings greater challenges, especially in extremely congested scenes. In this paper, we focus on how to achieve precise instance localization in high-density crowd scenes, and on alleviating the problem that the feature extraction ability of traditional models is reduced by target occlusion, image blur, etc. To this end, we propose a Dilated Convolutional Swin Transformer (DCST) for congested crowd scenes. Specifically, a window-based vision transformer is introduced into the crowd localization task, which effectively improves the capacity of representation learning. Then, a well-designed dilated convolutional module is inserted into several stages of the transformer to enhance large-range contextual information. Extensive experiments demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance on five popular datasets. In particular, the model achieves an F1-measure of 77.5% and an MAE of 84.2 in terms of localization and counting performance, respectively.
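As a rough illustration of why dilated convolutions help with large-range context, the sketch below computes how the receptive field of a stack of stride-1 convolutions grows with the dilation rate, and applies a small 1-D dilated convolution. It is a minimal sketch in plain Python, not the DCST module itself: the function names and the dilation rates (2 and 3) are assumptions chosen for illustration and are not taken from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs; each layer
    widens the receptive field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated cross-correlation with stride 1."""
    span = effective_kernel_size(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


# Two ordinary 3-tap layers versus two dilated 3-tap layers
# (illustrative rates 2 and 3, same number of weights).
plain = receptive_field([(3, 1), (3, 1)])
dilated = receptive_field([(3, 2), (3, 3)])
print(plain, dilated)
```

With the same number of weights, the dilated stack covers a wider input span, which is the general property that the abstract refers to as enhancing large-range contextual information.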
Similar resources
Fully Convolutional Crowd Counting on Highly Congested Scenes
In this paper we advance the state-of-the-art for crowd counting in high density scenes by further exploring the idea of a fully convolutional crowd counting model introduced by (Zhang et al., 2016). Producing an accurate and robust crowd count estimator using computer vision techniques has attracted significant research interest in recent years. Applications for crowd counting systems exist in...
CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven and deep learning method that can understand highly congested scenes and perform accurate count estimation as well as present high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CN...
Instance-Sensitive Fully Convolutional Networks
Fully convolutional networks (FCNs) have been proven very successful for semantic segmentation, but the FCN outputs are unaware of object instances. In this paper, we develop FCNs that are capable of proposing instance-level segment candidates. In contrast to the previous FCN that generates one score map, our FCN is designed to compute a small set of instance-sensitive score maps, each of which...
Relation Extraction with Multi-instance Multi-label Convolutional Neural Networks
Distant supervision is an efficient approach that automatically generates labeled data for relation extraction (RE). Traditional distantly supervised RE systems rely heavily on handcrafted features, and hence suffer from error propagation. Recently, a neural network architecture has been proposed to automatically extract features for relation classification. However, this approach follows the t...
A Baseline for Visual Instance Retrieval with Deep Convolutional Networks
This work presents simple pipelines for visual image retrieval exploiting image representations based on convolutional networks (ConvNets), and demonstrates that ConvNet image representations outperform other state-of-the-art image representations on six standard image retrieval datasets. ConvNet based image features have increasingly permeated the field of computer vision and are replacing han...
Journal
Journal title: Neurocomputing
Year: 2022
ISSN: 0925-2312, 1872-8286
DOI: https://doi.org/10.1016/j.neucom.2022.09.113